A Study and Comparative Analysis of Different Stemmer and Character Recognition Algorithms for Indian Gujarati Script

Authors

  • Rajnish M. Rakholia
  • Jatinderkumar R. Saini
Abstract

A lot of work has been reported on optical character recognition for various non-Indian scripts such as Chinese, English and Japanese, and for Indian scripts such as Tamil, Hindi and Telugu. In this paper, we present a literature review of stemmer, optical character recognition (OCR) and text mining work on Indian scripts, mainly on the Gujarati language. We discuss the different techniques for OCR and text mining in Gujarati script, summarize most of the published work on this topic, and give future directions for research in the field of Indian scripts.

Similar resources

Script Identification from Bilingual Gujarati-English Documents

In a multilingual country like India, English words are often interspersed with the Indian regional languages in most official papers, school textbooks and magazines. A bilingual Optical Character Recognition (OCR) system is therefore needed that can recognize these bilingual documents and store them for future use. In this paper, the authors present an OCR system developed for the script ...

Gujarati Character Identification: A Survey

English character recognition techniques have been studied extensively in the last two decades and have achieved remarkably high progress and success rates. For regional languages, however, these techniques are still emerging and their success rates are very poor. In Gujarat, there are thousands of people who can speak, write and understand only the Gujarati language. Rapid growing computation may increase Indian CR met...

Analysis of structural features and classification of Gujarati consonants for offline character recognition

The wide range of applications and the numerous other complexities involved in character recognition (CR) make it a continuous and open area of research. Feature selection and classification play a major role in achieving higher accuracy for character recognition. In the era of digitization, there is a compelling need for a CR system for regional scripts. This paper presents analysis of structural features an...

Extraction of Characters and Modifiers from Handwritten Gujarati Words

Research activity related to Optical Character Recognition (OCR) is very limited for almost all Indian languages. Gujarati is one of the scripts for which very little literature is available as far as OCR is concerned. This paper describes one of the important phases of OCR: the segmentation of handwritten words into their basic components, namely basic characters, conjunct character...

Rotation Estimation Of Gujarati Script Document Using Hough Transform

This paper proposes a technique for estimating the skew present in the image of a Gujarati script document using the Hough transform. It includes simple pre-processing tasks such as dilation, erosion and thinning. Once these are applied, the final image is passed through the Hough transform and a fairly close estimate of the angle is obtained. It provides promising results when applie...

Publication date: 2014